Gaussian Approximation for the Sup-Norm of High-Dimensional Matrix-Variate U-Statistics and Its Applications
Abstract
This paper studies the Gaussian approximation of high-dimensional and non-degenerate U-statistics of order two under the supremum norm. We propose a two-step Gaussian approximation procedure that does not impose structural assumptions on the data distribution. Specifically, subject to mild moment conditions on the kernel, we establish the explicit rate of convergence that decays polynomially in sample size for a high-dimensional scaling limit, where the dimension can be much larger than the sample size. We also supplement a practical Gaussian wild bootstrap method to approximate the quantiles of the maxima of centered U-statistics and prove its asymptotic validity. The wild bootstrap is demonstrated on statistical applications for high-dimensional non-Gaussian data including: (i) principled and data-dependent tuning parameter selection for regularized estimation of the covariance matrix and its related functionals; (ii) simultaneous inference for the covariance and rank correlation matrices. In particular, for the thresholded covariance matrix estimator with the bootstrap selected tuning parameter, we show that the Gaussian-like convergence rates can be achieved for heavy-tailed data, which are less conservative than those obtained by the Bonferroni technique that ignores the dependency in the underlying data distribution. In addition, we also show that even for subgaussian distributions, error bounds of the bootstrapped thresholded covariance matrix estimator can be much tighter than those of the minimax estimator with a universal threshold.
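The Gaussian wild bootstrap mentioned in the abstract can be illustrated with a minimal sketch. It assumes the covariance kernel h(x, y) = (x - y)(x - y)^T / 2, whose U-statistic is the unbiased sample covariance matrix, and a multiplier bootstrap built on the estimated projection terms g_i = (1/(n-1)) sum_{j != i} h(X_i, X_j) - U_n. The function name, parameters, and the 1/sqrt(n) scaling are assumptions made for illustration and may differ from the paper's exact procedure.

```python
import numpy as np

def wild_bootstrap_max_quantile(X, alpha=0.05, n_boot=500, seed=0):
    """Hypothetical helper: Gaussian multiplier bootstrap for the sup-norm
    of a centered covariance-type U-statistic (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    D = X - X.mean(axis=0)                      # centered observations
    # U-statistic with kernel h(x, y) = (x - y)(x - y)^T / 2,
    # which equals the unbiased sample covariance matrix.
    S = D.T @ D / (n - 1)
    # Projection terms g_i = (1/(n-1)) * sum_{j != i} h(X_i, X_j) - U_n.
    # For this kernel they reduce to 0.5 * (n/(n-1) * d_i d_i^T - S).
    outer = np.einsum("ij,ik->ijk", D, D)       # shape (n, p, p)
    G = 0.5 * (n / (n - 1) * outer - S)
    G = G.reshape(n, p * p)
    # Multiply by i.i.d. N(0, 1) weights and take the entrywise maximum
    # (assumed scaling: 1/sqrt(n)).
    E = rng.standard_normal((n_boot, n))
    T_star = np.abs(E @ G).max(axis=1) / np.sqrt(n)
    return S, G, np.quantile(T_star, 1 - alpha)

# Usage on heavy-tailed toy data (Student t with 5 degrees of freedom).
X = np.random.default_rng(1).standard_t(df=5, size=(200, 10))
S, G, q = wild_bootstrap_max_quantile(X)
```

Under the scaling assumed here, the returned quantile would approximate that of the maximum of sqrt(n) |U_n - theta| / 2, so 2 * q / sqrt(n) is one candidate for a data-dependent entrywise threshold. That mapping is part of the sketch's assumptions, not a statement of the paper's result.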
Similar resources
On Conditional Applications of Matrix Variate Normal Distribution
In this paper, by conditioning on the matrix variate normal distribution (MVND) the construction of the matrix t-type family is considered, thus providing a new perspective of this family. Some important statistical characteristics are given. The presented t-type family is an extension to the work of Dickey [8]. A Bayes estimator for the column covariance matrix Σ of MVND is derived under ...
Matrix-Variate Beta Generator - Developments and Application
Matrix-variate beta distributions are applied in different fields of hypothesis testing, multivariate correlation analysis, zero regression, canonical correlation analysis, etc. A methodology is proposed to generate matrix-variate beta generator distributions by combining the matrix-variate beta kernel with an unknown function of the trace operator. Several statistical characteristics, exten...
A comparison of alternative approaches to sup-norm goodness of fit tests with estimated parameters
Goodness of fit tests based on sup-norm statistics of empirical processes have nonstandard limiting distributions when the null hypothesis is composite — that is, when parameters of the null model are estimated. Several solutions to this problem have been suggested, including the calculation of adjusted critical values for these nonstandard distributions and the transformation of the empirical ...
The Brownian Frame Process as a Rough Path
The Brownian frame process $T^B$ is defined as $T^B_t := (B_{t-1+u})_{0 \le u \le 1}$, $t \in [0, 1]$, where $B$ is a real-valued Brownian motion with parameter set $[-1, 1]$. This thesis investigates properties of the path-valued Brownian frame process relevant to establishing an integration theory based on the theory of rough paths ([Lyons, 1998]). The interest in studying this object comes from its connection with ...
Transposable Regularized Covariance Models with Applications to High-dimensional Data: A Dissertation Submitted to the Department of Statistics and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
High-dimensional data is becoming more prevalent with new technologies in biomedical sciences, imaging and the Internet. Many examples of this data often contain complex relationships between and among sets of variables. When arranged in the form of a matrix, this data is transposable, meaning that either the rows, columns or both can be treated as features. To model transposable data, we prese...